*Figure: the M51 galaxy (image source).*

Galaxy image classification¶

An introduction into galaxy classification¶

In the vast expanse of the cosmos, galaxies stand as the cosmic building blocks, each holding secrets that illuminate our understanding of the universe. But amidst this cosmic tapestry lies a puzzle waiting to be solved, a classification system that unlocks the secrets of these celestial entities.

Galaxies, sprawling collections of stars, gas, dust, and dark matter, come in a myriad of shapes, sizes, and structures. From majestic spirals with their sweeping arms to enigmatic ellipticals, galaxies captivate astronomers with their diversity. But understanding this diversity requires more than appreciation; it requires organization, a way to categorize and classify these celestial giants.

The genesis of galaxy classification traces back to the early 20th century, when astronomers such as Edwin Hubble embarked on a quest to categorize galaxies. Their efforts produced what is now known as the Hubble sequence, a classification scheme that divides galaxies into three main types: spirals, ellipticals, and irregulars. Spiral galaxies, with their pinwheel-like arms, showcase ongoing star formation and dynamic galactic disks. Elliptical galaxies, on the other hand, exhibit a smooth, football-like shape, hinting at their older stellar populations and minimal gas and dust content. Irregular galaxies defy convention, with chaotic, asymmetrical shapes born from galactic collisions and interactions.

*Figure (image source).*

But why classify galaxies? What purpose does it serve beyond organizing celestial objects into neat categories? The answer lies in the insights it offers into galactic evolution, formation, and the very nature of the universe itself.

Galaxy classification provides a window into the evolutionary pathways of these cosmic entities. Spiral galaxies, for instance, are often sites of active star formation, fueled by the presence of gas and dust in their disks. Meanwhile, elliptical galaxies, with their lack of prominent spiral arms, suggest a different evolutionary history, dominated by stellar aging and galactic mergers. Furthermore, galaxy classification sheds light on the processes that govern the formation of these cosmic structures. Spirals, believed to form from the gravitational collapse of gas clouds, represent a different formation mechanism from ellipticals, which likely arise from the merger of smaller galaxies. Irregular galaxies, with their chaotic shapes, offer clues to the role of galactic interactions in shaping the cosmic landscape.

Beyond individual galaxies, classification enables astronomers to map the cosmic web, the intricate network of filaments and voids that weave through the universe. By studying the distribution of galaxy types across space, astronomers gain insights into the large-scale structure of the cosmos and the forces that shape it.

*Figure (image source).*

In the grand tapestry of the universe, galaxy classification serves as a guiding thread, weaving together our understanding of cosmic evolution, formation, and structure. From the majestic spirals to the enigmatic ellipticals, galaxies offer a glimpse into the cosmic continuum, where past, present, and future merge in a celestial dance of cosmic proportions.

For more information about galaxy classification, Wikipedia is a good place to start!

Scope of this project¶

In this project we will explore some techniques to classify galaxy types. Using the dataset (described in the next chapter), we will go through data exploration, data transformation, and a couple of models. In particular, we will focus on predicting these 3 galaxy types:

  • E: Elliptical;
  • S: Normal spiral;
  • SB: Barred spiral;

Project outputs¶

The outputs of this project are:

  • EDA: Explore the dataset and develop know-how on the data
  • Data preprocessing: Use the know-how obtained in the EDA to prepare the data for the models
  • Modeling: Build a model able to determine the type of a galaxy, from a pre-determined list of galaxy types
  • Evaluation: Compare the different models

Dataset¶

The dataset contains images from the SDSS, originally meant for the Galaxy Zoo 2 project, in which volunteers manually voted on the classification of every galaxy in the dataset.

The images are available in different shapes:

  • 69 x 69;
  • 227 x 227;
  • 299 x 299;
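A quick way to check which resolution a given file has is `Image.open(path).size`. Note that PIL reports `(width, height)` while a NumPy array is shaped `(height, width, channels)`; the sketch below uses a synthetic, non-square stand-in image (no dataset file needed) to make the convention explicit:

```python
import numpy as np
from PIL import Image

# Synthetic stand-in image: 227 rows (height) by 69 columns (width)
arr = np.zeros((227, 69, 3), dtype=np.uint8)
img = Image.fromarray(arr)

print(img.size)             # (69, 227): PIL gives width first
print(np.array(img).shape)  # (227, 69, 3): back to (height, width, channels)
```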

An additional CSV file contains the classification and some additional information:

  • dr7objid: The object ID in SDSS Data Release 7;
  • asset_id: The image ID;
  • gz2class: The galaxy classification;
  • total_classifications: The number of classifications;
  • total_votes: The number of votes for the classification;
  • agreement: The agreement between the classifications;

As we will see in the EDA, the agreement is not always reliable, which means that our ground truth is somewhat "messy".

Initialisation¶

In [ ]:
# General purpose libraries
import polars as pl 
import numpy as np
import warnings
import math 
from PIL import Image
from pathlib import Path 
from multiprocessing import Pool

# Plotting libraries
import plotly 
import plotly.express as px
import plotly.graph_objects as go 
from plotly.subplots import make_subplots
import plotly.io as pio 

# Model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from keras import backend as K

# Metrics
from sklearn.metrics import (confusion_matrix, 
                             accuracy_score, 
                             f1_score, 
                             recall_score, 
                             precision_score)
2024-04-24 23:11:50.746192: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-24 23:11:50.746289: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-24 23:11:50.886135: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
In [ ]:
# Setting the plotly theme
pio.templates.default = 'plotly_white'

# Filter warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
In [ ]:
# Define categories and encoding 
CATS = ['E', 'S', 'SB']
CATS_TO_IDX = {v: k for k, v in enumerate(CATS)}
IDX_TO_CATS = {k: v for k, v in enumerate(CATS)}

# Define categories colors 
CATS_COLORS = ['red', 'blue', 'green']
COLOR_MAP = {k: v for k, v in zip(CATS, CATS_COLORS)}

# Define image shapes 
SHAPES = ['small', 'medium', 'large']

# Data preparation
TEST_FRAC = 0.3

# Models params
IMG_PARAMS = {
    'small':{
        'height': 69, 
        'width': 69, 
        'channels': 3, 
    }, 
    'medium':{
        'height': 227, 
        'width': 227, 
        'channels': 3
    },
    'large':{
        'height': 299, 
        'width': 299, 
        'channels': 3
    }
}
TRAIN_PATH = Path('/kaggle/working/train')
TEST_PATH = Path('/kaggle/working/test')
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 32
TRAIN_TEST_SPLIT = 0.3
EPOCHS = 30

# When True, skip displaying plots and other expensive EDA output
RUN_ONLY = False
In [ ]:
project_path = Path('/kaggle/input/resized-reduced-gz2-images')
path_69 = project_path.joinpath('images_E_S_SB_69x69_a_03')
path_227 = project_path.joinpath('images_E_S_SB_227x227_a_03')
path_299 = project_path.joinpath('images_E_S_SB_299x299_a_03')
path_csv = project_path.joinpath('3class_map_a(p).csv')

Explorative data analysis¶

Dataset information¶

In [ ]:
df = pl.read_csv(path_csv)
df.head()
Out[ ]:
shape: (5, 7)
|   | dr7objid | asset_id | gz2class | total_classifications | total_votes | agreement |
|---|---|---|---|---|---|---|
| 0 | 587732591714893851 | 58957 | "Sc+t" | 45 | 342 | 1.0 |
| 1 | 588009368545984617 | 193641 | "Sb+t" | 42 | 332 | 1.0 |
| 2 | 587732484359913515 | 55934 | "Ei" | 36 | 125 | 0.384527 |
| 3 | 587741723357282317 | 158501 | "Sc+t" | 28 | 218 | 0.766954 |
| 4 | 587738410866966577 | 110939 | "Er" | 43 | 151 | 0.399222 |
In [ ]:
df.describe()
Out[ ]:
shape: (9, 8)
| statistic | (index) | dr7objid | asset_id | gz2class | total_classifications | total_votes | agreement |
|---|---|---|---|---|---|---|---|
| count | 206168.0 | 206168.0 | 206168.0 | "206168" | 206168.0 | 206168.0 | 206168.0 |
| null_count | 0.0 | 0.0 | 0.0 | "0" | 0.0 | 0.0 | 0.0 |
| mean | 103083.5 | 5.8782e17 | 141056.802763 | null | 42.622196 | 184.118694 | 0.43431 |
| std | 59515.719487 | 1.8327e14 | 81082.330522 | null | 5.910699 | 63.328847 | 0.28728 |
| min | 0.0 | 5.8772e17 | 3.0 | "Ec" | 16.0 | 32.0 | 0.0 |
| 25% | 51542.0 | 5.8773e17 | 71370.0 | null | 39.0 | 141.0 | 0.17935 |
| 50% | 103084.0 | 5.8774e17 | 139530.0 | null | 43.0 | 160.0 | 0.456436 |
| 75% | 154625.0 | 5.8774e17 | 210903.0 | null | 46.0 | 207.0 | 0.632321 |
| max | 206167.0 | 5.8885e17 | 295305.0 | "Sd?t(r)" | 79.0 | 604.0 | 1.0 |

Looking at the dataframe summary, we can see that we don't have any null counts, so we won't have to worry about that. Let's take a look at two interesting columns:

  • total_votes:
    • The minimum number of votes is 32, which is a small number considering that these votes are our ground truth;
    • The mean number of votes is 184, which is already better;
  • agreement:
    • The minimum agreement is 0. We can't consider this as ground truth.
    • The mean agreement is around 0.43, which is also a bit low.

Let's take a look at these columns visually:

In [ ]:
fig1 = px.histogram(df, x='total_votes', title="Total votes histogram")
fig1.show()

fig2 = px.histogram(df, x='agreement', title="Agreement histogram")
fig2.show()

We can also take a look at the number of unique values in the galaxy classification column gz2class:

In [ ]:
unique_class = df.n_unique(pl.col('gz2class'))
print(f'There are {unique_class} different galaxy categories')
There are 785 different galaxy categories

That is a lot! This is because the real classification system is much more complex (and messier) than the scheme shown in the introduction. Let's look, for example, at some of the subclasses whose label starts with Sb:

In [ ]:
df.filter(pl.col('gz2class').str.starts_with('Sb')).unique('gz2class').head(10)
Out[ ]:
shape: (10, 7)
| (index) | dr7objid | asset_id | gz2class | total_classifications | total_votes | agreement |
|---|---|---|---|---|---|---|
| 1650 | 587739827667402818 | 142832 | "Sb2l(l)" | 48 | 350 | 0.25661 |
| 9466 | 587729387147427913 | 34702 | "Sb1l(d)" | 39 | 284 | 0.154263 |
| 643 | 588017978367934481 | 226963 | "Sb?l(i)" | 34 | 244 | 0.082038 |
| 3140 | 587739379918372992 | 124278 | "Sb3t(o)" | 50 | 414 | 1.0 |
| 1368 | 587732050024595525 | 50108 | "Sb4m" | 23 | 178 | 1.0 |
| 1223 | 587731890842370258 | 49028 | "Sb2l(i)" | 34 | 224 | 0.045566 |
| 9389 | 587733080808751190 | 64368 | "Sb2t(d)" | 35 | 276 | 0.795675 |
| 58315 | 588023047474249814 | 235667 | "Sb+m(l)" | 58 | 214 | 0.349204 |
| 17132 | 588017719576559713 | 220028 | "Sb3m(i)" | 41 | 325 | 1.0 |
| 1292 | 587738410323214343 | 110584 | "Sb2m(m)" | 46 | 275 | 0.0 |

As mentioned in the introduction, we will focus on the high-level classification: elliptical (E), normal spiral (S) and barred spiral (SB) galaxies. As the images are already sorted into folders in the dataset, we will take the classification information directly from the folder structure.

Dataset enrichment¶

Here we will infer the ground truth from the dataset folder structure.

In [ ]:
def iterate_folder(folder):
    """Collect every .jpg file two directory levels below folder"""
    img_type = '.jpg'
    files = []
    for sub_folder in folder.iterdir(): 
        for active_folder in sub_folder.iterdir(): 
            files += [f for f in active_folder.iterdir() if f.suffix == img_type]
    return files 

def build_df(files):
    """Map each file's stem (the asset_id) to its path"""
    path = list(map(str, files))
    asset_id = [f.stem for f in files]
    return pl.DataFrame(dict(path=path, asset_id=asset_id))

small_files = build_df(iterate_folder(path_69)).rename({'path': 'path_small'})
medium_files = build_df(iterate_folder(path_227)).rename({'path': 'path_medium'})
large_files = build_df(iterate_folder(path_299)).rename({'path': 'path_large'})
In [ ]:
images_df = (
    small_files
    .join(medium_files, on='asset_id')
    .join(large_files, on='asset_id')
    .with_columns(
        target=pl.col('path_small').str.split("/").list.get(-2)
    )
    .select(['asset_id', 'target', 'path_small', 'path_medium', 'path_large'])
)
In [ ]:
df = (
    df
    .with_columns(pl.col('asset_id').cast(pl.Utf8))
    .join(images_df, on='asset_id', how='left')
)
df.describe()
Out[ ]:
shape: (9, 12)
| statistic | (index) | dr7objid | asset_id | gz2class | total_classifications | total_votes | agreement | target | path_small | path_medium | path_large |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 206168.0 | 206168.0 | "206168" | "206168" | 206168.0 | 206168.0 | 206168.0 | "133812" | "133812" | "133812" | "133812" |
| null_count | 0.0 | 0.0 | "0" | "0" | 0.0 | 0.0 | 0.0 | "72356" | "72356" | "72356" | "72356" |
| mean | 103083.5 | 5.8782e17 | null | null | 42.622196 | 184.118694 | 0.43431 | null | null | null | null |
| std | 59515.719487 | 1.8327e14 | null | null | 5.910699 | 63.328847 | 0.28728 | null | null | null | null |
| min | 0.0 | 5.8772e17 | "100" | "Ec" | 16.0 | 32.0 | 0.0 | "E" | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" |
| 25% | 51542.0 | 5.8773e17 | null | null | 39.0 | 141.0 | 0.17935 | null | null | null | null |
| 50% | 103084.0 | 5.8774e17 | null | null | 43.0 | 160.0 | 0.456436 | null | null | null | null |
| 75% | 154625.0 | 5.8774e17 | null | null | 46.0 | 207.0 | 0.632321 | null | null | null | null |
| max | 206167.0 | 5.8885e17 | "99999" | "Sd?t(r)" | 79.0 | 604.0 | 1.0 | "SB" | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" |

There are some missing images in the dataset compared to the CSV file. We can drop these rows from the dataframe, as we don't have any images for them.

In [ ]:
df = df.drop_nulls()
print(f'The new dataset has {df.shape[0]} lines')
The new dataset has 133812 lines

Now that we have defined the targets in the dataset, we can take a look at the balance and distribution of the targets.

In [ ]:
plot_df = df.group_by('target').len().sort('target')
fig = px.bar(plot_df, 
             x='target', 
             y='len', 
             title='Images count per target', 
             height=500, 
             width=500, 
             color='target', 
             color_discrete_map=COLOR_MAP, 
             opacity=0.6
            )
fig.update_layout(showlegend=False)

if not RUN_ONLY:
    fig.show()

The dataset is not balanced; we will need to take this into account in the data preprocessing. We can also check the agreement and total-votes histograms for each target.
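As an aside, an alternative to the downsampling used later in this notebook would be class weighting. A sketch with hypothetical class counts (the real counts come from the bar plot above); the resulting dict is the format Keras accepts through model.fit(..., class_weight=...):

```python
# Hypothetical per-class image counts, for illustration only
counts = {"E": 60000, "S": 66000, "SB": 7000}

total = sum(counts.values())
n_classes = len(counts)

# Inverse-frequency weighting: rarer classes get proportionally larger weights
class_weight = {
    idx: total / (n_classes * n)
    for idx, n in enumerate(counts.values())
}
print(class_weight)  # SB (index 2) gets the largest weight
```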

In [ ]:
fig = go.Figure()
for cat, color in zip(CATS, CATS_COLORS):
    x = df.filter(pl.col('target') == cat)['total_votes']
    fig.add_trace(go.Histogram(x=x,
                               name=cat, 
                               marker_color=color,
                               opacity=0.6
                              )
                 )

fig.update_layout(barmode='overlay', 
                  title="Total votes histogram per target"
                 )
if not RUN_ONLY:
    fig.show()
In [ ]:
fig = go.Figure()
for cat, color in zip(CATS, CATS_COLORS):
    x = df.filter(pl.col('target') == cat)['agreement']
    fig.add_trace(go.Histogram(x=x,
                               name=cat, 
                               marker_color=color,
                               opacity=0.6
                              )
                 )

fig.update_layout(barmode='overlay', 
                  title="Agreement histogram per target"
                 )
if not RUN_ONLY:
    fig.show()

We have a lot less agreement for the SB and S galaxies. We'll see whether this has an impact on the model or not.

Images exploration¶

In this part we will explore the images.

In [ ]:
def load_image(image_obj, as_image=False):
    """Load an image from a path or str, or pass a numpy array through"""
    if isinstance(image_obj, (Path, str)):
        img = Image.open(image_obj)
        if not as_image:
            img = np.array(img)
    elif isinstance(image_obj, np.ndarray):
        if as_image:
            raise ValueError('Cannot return a PIL image from an array input')
        img = image_obj
    else:
        raise TypeError(f'Unsupported image type: {type(image_obj)}')
    return img



def plot_images(images, subtitles=None, title=None, maxrow = 5, grayscale=False, **kwargs): 
    """Utility to plot images"""
    images = list(map(load_image, images))
    if len(images) <= maxrow: 
        m = len(images)
        n = 1
    else: 
        m = maxrow
        n = math.ceil(len(images)/maxrow)
    
    fig = make_subplots(n, 
                        m, 
                        subplot_titles=subtitles, 
                        horizontal_spacing=0.05, 
                        vertical_spacing=0.05 )
    
    update_args = {}
    for idx, image in enumerate(images):
        row, col = divmod(idx, maxrow)
        if grayscale: 
            trace = go.Heatmap(z=image,
                               zmin=kwargs.get('zmin', 0), 
                               zmax=kwargs.get('zmax', 255), 
                               coloraxis='coloraxis'
                              )
            if idx == 0:
                ref = ''
            else:
                ref = str(idx+1)
            update_args[f'yaxis{ref}_scaleanchor'] = f"x{ref}"
        else: 
            trace = go.Image(z=image)
        fig.add_trace(trace, 
                      row+1, 
                      col+1, 
                     )
        
    if grayscale: 
        fig.update_layout(coloraxis = {'colorscale': kwargs.get('colorscale', 'viridis')})    
        fig.update_layout(**update_args)
        fig.update_xaxes(showgrid=False)
        fig.update_yaxes(showgrid=False)
        
    fig.update_layout(title=title, 
                      height=kwargs.get('height', 400), 
                      width=kwargs.get('width', 1200), 
                      autosize=kwargs.get('autosize', False))
    fig.update_yaxes(showticklabels=False)
    fig.update_xaxes(showticklabels=False)        
        
    return fig
                
In [ ]:
seed = 42
pictures = []
captions = []
for cat in CATS: 
    s = (
            df
            .filter(pl.col('target') == cat)
            .sample(1, seed=seed)
            .select(['dr7objid', 'target', 'path_small', 'path_medium', 'path_large'])
            .transpose()
            .get_column("column_0")
            .to_list()
        )
    pictures.extend(s[2:])
    captions.extend(map(lambda x: f'{s[0]} \n\n Cat: {s[1]}, shape: {x}', ['69x69', '227x227', '299x299']))

fig = plot_images(pictures, captions, title="Images overview", maxrow=3, width=1500, height=800)

if not RUN_ONLY:
    fig.show()

The difference in quality between the three image sizes is visible, as well as the differences between the galaxy types. Let's show some samples of every category:

In [ ]:
samples = 3
figs = []
for cat in CATS: 
    s = (
            df
            .filter(pl.col('target') == cat)
            .sample(samples, seed=seed)
            .select(['dr7objid', 'path_large'])
            .to_dict(as_series=False)
        )
    pictures=s['path_large']
    captions=s['dr7objid']
    fig = plot_images(pictures, 
                      captions, 
                      title=f"Images overview: {cat}", 
                      maxrow=3, 
                      width=1500, 
                      height=800)
    figs.append(fig)
    
for fig in figs: 
    if not RUN_ONLY:
        fig.show()

A few observations here:

  • Some of the pictures are not so easy to categorize by eye
  • There are some other objects as well (stars or other very small galaxies)
  • Some of these galaxies are actually two galaxies in collision (like M51 in the first image of this notebook)

Image stats¶

Now let's take a look at the images themselves and extract some useful information.

In [ ]:
def process_one(data):
    img = Image.open(data['path'])
    img_l = img.convert('L')
    img_arr = np.array(img)
    img_arr_l = np.array(img_l)
    means = img_arr.mean(axis=(0, 1))
    mean_l = img_arr_l.mean()
    return (data['target'], *means, mean_l, img_arr_l)

def batch_process(files):
    with Pool() as pool: 
        results = pool.map(process_one, files)
    return results 

def process_results(data):
    results = {}
    results['r_mean'] = list(map(lambda x: x[1], data))
    results['g_mean'] = list(map(lambda x: x[2], data))
    results['b_mean'] = list(map(lambda x: x[3], data))
    results['l_mean'] = list(map(lambda x: x[4], data))
    results['p_mean'] = np.mean(np.stack(list(map(lambda x: x[5], data)), axis=2), axis=2)
    return results 
In [ ]:
%%time
if not RUN_ONLY:
    files = df.select(['target', 'path_small']).rename({'path_small': 'path'}).to_dicts()
    results = batch_process(files) 

    imgs_stats = {}
    for cat in CATS:
        data = list(filter(lambda x: x[0] == cat, results))
        imgs_stats[cat] = process_results(data)  
CPU times: user 4.78 s, sys: 2.05 s, total: 6.83 s
Wall time: 3min 30s
In [ ]:
if not RUN_ONLY:
    images = list(map(lambda x: imgs_stats[x]['p_mean'], CATS))
    captions = list(map(lambda x: f'Category: {x}', CATS))
    plot_images(images, 
                subtitles=captions, 
                title="Mean gray values", 
                maxrow = 3, 
                grayscale=True, 
                width=1200, 
                zmin=0, 
                zmax=190,
                colorscale='Turbo'
               ).show()

The plot above shows the mean value of each pixel (in grayscale) over the complete dataset for each category. The galaxies are well centered in the images. We can also notice that the center of elliptical galaxies seems to be brighter than the others. The orientation of the galaxies appears to be random.

Now let's look to the different color channels:

In [ ]:
def plot_color_hists(color, imgs_stats, title=''):
    fig = go.Figure() 
    traces = []
    for cat in CATS:
        x = imgs_stats[cat][color]
        trace = go.Histogram(x=x,
                             name=cat, 
                             marker_color=COLOR_MAP[cat], 
                             opacity=0.6, 
                             histnorm='percent'
                            )
        traces.append(trace)
    fig.add_traces(traces)
    fig.update_layout(barmode='overlay', 
                      title=title)
    return fig 
In [ ]:
if not RUN_ONLY:
    plot_color_hists('r_mean', imgs_stats, title="Color histogram, channel: R").show()
In [ ]:
if not RUN_ONLY:
    plot_color_hists('g_mean', imgs_stats, title="Color histogram, channel: G").show()
In [ ]:
if not RUN_ONLY:
    plot_color_hists('b_mean', imgs_stats, title="Color histogram, channel: B").show()
In [ ]:
if not RUN_ONLY:
    plot_color_hists('l_mean', imgs_stats, title="Color histogram, channel: L").show()

Be careful when reading the last plots: the colors do not represent the channels but the galaxy category; there is one plot per channel, Red, Green, Blue, Luminance (grayscale). Once again, the S and SB galaxies look very similar.

As we also saw in the image previews, the images are very dark. We can see this in each channel's histogram as well.

Pre processing¶

For this step I decided to use downsampling and data augmentation. We will first select some data and create symlinks in a work folder, so that we can access the files with the Keras preprocessing tools.

In [ ]:
def symlink_files(dests, source, shape, cat):
    """Symlink every file in dests into source/shape/cat"""
    target_dir = source.joinpath(shape).joinpath(cat)
    target_dir.mkdir(exist_ok=True, parents=True)
    for f in map(Path, dests): 
        link_path = target_dir.joinpath(f.name)
        link_path.symlink_to(f)
    

def generate_infra(df, source, n_samples, test_frac=TEST_FRAC):
    train_frac = 1 - test_frac
    train_size = np.ceil(train_frac*n_samples).astype(int)
    test_size = n_samples - train_size
    folder_col = ['train']*train_size + ['test']*test_size
    for cat in CATS:
        df_res = (
            df
            .filter(pl.col('target') == cat)
            .sort('agreement', descending=True)
            .sample(n=n_samples, shuffle=True, seed=45)
            .select(['path_small', 'path_medium', 'path_large'])
            .with_columns(pl.Series(name='dest_folder', values=folder_col))
        )
        for folder in ['test', 'train']:
            folder_source = source.joinpath(folder)
            data = (
                df_res
                .filter(pl.col('dest_folder') == folder)
                .to_dict(as_series=False)
            )
            for shape in SHAPES:
                dests = data.get('path_' + shape)
                symlink_files(dests, folder_source, shape, cat)
In [ ]:
wk_path = Path('/kaggle/working/')
n_samples = df.filter(pl.col('target') == 'SB').shape[0]

try:
    generate_infra(df, wk_path, n_samples)
except FileExistsError:
    pass 
In [ ]:
for shape in SHAPES:
    p = TRAIN_PATH.joinpath(shape)
    for cat in p.iterdir():
        files = [f for f in cat.iterdir()]
        l = len(files)
        s = files[0].is_symlink()
        print(f'The dataset {shape} for category {cat.name} has {l} files. Symlink is {s}')
The dataset small for category E has 6982 files. Symlink is True
The dataset small for category S has 6982 files. Symlink is True
The dataset small for category SB has 6982 files. Symlink is True
The dataset medium for category E has 6982 files. Symlink is True
The dataset medium for category S has 6982 files. Symlink is True
The dataset medium for category SB has 6982 files. Symlink is True
The dataset large for category E has 6982 files. Symlink is True
The dataset large for category S has 6982 files. Symlink is True
The dataset large for category SB has 6982 files. Symlink is True

Model¶

Building model¶

Here we will define some functions to build the model that we can reuse for several models.

Dataloader¶

Here we define functions to load dataset, augment the data and rescale it

In [ ]:
def data_augmentation(imgs, augmentation_layers):
    """Utility to augment the data with a list of layers"""
    
    for layer in augmentation_layers:
        imgs = layer(imgs)
    return imgs

def load_ds(dataset_name, 
            validation_split=TRAIN_TEST_SPLIT, 
            tune_performance=True, 
            rescale=False, 
            augmentation_layers = None 
           ):
    """Utility to load training and validation dataset"""
    
    params = IMG_PARAMS.get(dataset_name)
    data_dir = TRAIN_PATH.joinpath(dataset_name)
    
    train_ds, val_ds = keras.utils.image_dataset_from_directory(
        directory=data_dir,
        labels='inferred', 
        color_mode='rgb', 
        batch_size=BATCH_SIZE, 
        image_size=(params['height'], params['width']), 
        validation_split=validation_split, 
        subset='both',
        shuffle=True, 
        seed=42
    )
    
    if tune_performance: 
        train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
        val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
        
    if augmentation_layers is not None: 
        train_ds = train_ds.map(
            lambda x, y: (data_augmentation(x, augmentation_layers), y), 
            num_parallel_calls=AUTOTUNE
        )
    
    if rescale: 
        normalization_layer = layers.Rescaling(1./255)
        train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
        val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
    
    return train_ds, val_ds         

Generic model builder¶

Here we will build model generators for two different architectures:

  • Simple CNN
  • Xception network explanation

(modified version from the keras example page)

In [ ]:
def cnn_model(dataset_name, 
              sizes,
              dropout = 0.2, 
              rescaling = False, 
              augmentation_layers = None
             ):
    """
    Build a CNN model
    """
    
    params = IMG_PARAMS.get(dataset_name)
    input_shape = (params['width'], params['height'], params['channels'])
   
    stack = []    
    stack.append(layers.Input(shape=input_shape, name='Input image'))
    
    if augmentation_layers is not None:
        for l in augmentation_layers: 
            stack.append(l)
    
    if rescaling:
        stack.append(layers.Rescaling(1.0/255, name='rescaling'))
    
    for i, size in enumerate(sizes):
        conv = layers.Conv2D(size, 
                             3,
                             padding='same',
                             activation='relu', 
                             name=f'convolution_{i}'
                            )
        stack.append(conv)
        stack.append(layers.MaxPooling2D(name=f'Pooling_{i}'))
    
    stack.append(layers.Dropout(dropout, name='Dropout'))
    stack.append(layers.Flatten(name='Flatten'))
    dense_1 = layers.Dense(sizes[-1]*2,
                           activation='relu', 
                           name='Dense_1'
                          )
    stack.append(dense_1)
    dense_2 = layers.Dense(3,
                           name='Outputs'
                          )
    stack.append(dense_2)
    
    model = keras.Sequential(stack)
    return model 
        
In [ ]:
def xception_model(dataset_name,
                   sizes, 
                   dropout = 0.2, 
                   rescaling = False, 
                   augmentation_layers = None
                  ):
    params = IMG_PARAMS.get(dataset_name)
    input_shape = (params['width'], params['height'], params['channels'])
    
    x = layers.Input(shape=input_shape, name='Input image')
    inputs = x
    
    if augmentation_layers is not None:
        for l in augmentation_layers:
            x = l(x)
    if rescaling:
        x = layers.Rescaling(1.0/255, name='Rescaling')(x)
    
    x = layers.Conv2D(sizes[0], 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    
    previous = x
    
    for size in sizes[1:-1]:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

        residual = layers.Conv2D(size, 1, strides=2, padding="same")(previous)
        x = layers.add([x, residual])  
        previous = x 
        
    x = layers.SeparableConv2D(sizes[-1], 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(dropout)(x)
    outputs = layers.Dense(3, activation=None)(x)
    
    return keras.Model(inputs, outputs)  

Training¶

In [ ]:
class TimestampCallback(tf.keras.callbacks.Callback):
    def __init__(self, metric_name="duration"):
        super().__init__()
        self.__epoch_start = None
        self.__metric_name = metric_name
        
    def on_epoch_begin(self, epoch, logs=None):
        self.__epoch_start = tf.timestamp() 
    
    def on_epoch_end(self, epoch, logs=None):
        logs[self.__metric_name] = tf.timestamp() - self.__epoch_start
        
In [ ]:
def plot_convergence(history, 
                     title = "Training stats", 
                     train_only = ['learning_rate', 'duration'], 
                     include_training_time = True 
                    ):
    """
    Plot the training stats 
    """
    if not isinstance(history, dict):
        hist = history.history
    else: 
        hist = history
    metrics = list(hist.keys())
    train_metrics = list(filter(lambda x: x[:4] != 'val_', metrics))
    epochs = list(range(1, len(hist[metrics[0]]) + 1))
    
    fig = make_subplots(
        rows=1, 
        cols=len(train_metrics), 
        subplot_titles=train_metrics,     
    )    
    
    markers = ['triangle-up-open', 'circle-open']
    colors = ['red', 'blue']
    names = ['training', 'validation']
    
    for idx, metric in enumerate(train_metrics):
        show_legend = False
        if not idx: 
            show_legend = True
        for idy, data in enumerate([metric, 'val_' + metric]):
            if data.startswith('val_') and data[4:] in train_only:
                continue
            y = hist.get(data)
            fig.add_trace(
                go.Scatter(
                    x=epochs, 
                    y=y, 
                    mode='lines+markers', 
                    marker=dict(
                        symbol=markers[idy], 
                        color=colors[idy], 
                        size=8
                    ), 
                    name=names[idy], 
                    showlegend=show_legend
                ), 
                row=1, 
                col=idx+1
            )
            
    if include_training_time:
        time = hist.get("duration", [0])
        tot_time = np.sum(time).astype(int)
        title += f' Total training time: {tot_time} s'
        if len(time) < EPOCHS:
            title += " (early stopping)"
        title += '.'
        
    fig.update_layout(
        template='plotly_white',
        title=title
    )

    return fig
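The metric bookkeeping in `plot_convergence` hinges on the `val_` prefix convention Keras uses in the history dict. A minimal check of that split, on a hand-made history dict with illustrative values:

```python
# Hand-made history dict mimicking what model.fit(...).history returns
# (values are illustrative).
hist = {
    "loss": [1.0, 0.8],
    "accuracy": [0.5, 0.6],
    "val_loss": [0.9, 0.85],
    "val_accuracy": [0.52, 0.58],
    "learning_rate": [1e-3, 1e-3],
    "duration": [2.0, 2.0],
}

# Train-side metrics are everything that does not start with "val_".
train_metrics = [m for m in hist if not m.startswith("val_")]

# Metrics listed in `train_only` have no validation counterpart to plot.
train_only = ["learning_rate", "duration"]
plottable_val = [m for m in train_metrics
                 if m not in train_only and "val_" + m in hist]
```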

CNN¶

In [ ]:
def run_cnn(shape, 
            sizes, 
            dropout, 
            augmentation_layers=None, 
            optimizer=None,
            metrics=['accuracy'],
            loss=None,
            epochs=EPOCHS,  
            callbacks=None,
            return_model = False
           ):
    """
    Build, compile and fit the CNN; return the training history
    (and the model and validation set when return_model is True).
    """
    train, val = load_ds(shape)
    
    model = cnn_model(shape,
                      sizes, 
                      dropout=dropout, 
                      rescaling=True, 
                      augmentation_layers=augmentation_layers
             )
    
    if optimizer is None: 
        optimizer = Adam(0.001)
    if loss is None: 
        loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    
    model.compile(optimizer=optimizer,
                  loss=loss, 
                  metrics=metrics)
    
    history = model.fit(train, 
                        validation_data=val, 
                        epochs=epochs, 
                        callbacks=callbacks
                       )
    if return_model: 
        return history.history, model, val
    return history.history
In [ ]:
image_aug_layers = [
    layers.RandomRotation(0.2), 
    layers.RandomZoom(0.3)
]
sizes = [16, 16, 16, 32, 32, 64]
dropout = 0.3

reduce_lr = ReduceLROnPlateau(patience=1, factor=0.5, min_lr=1e-6)
timestp = TimestampCallback()
early = EarlyStopping(patience=10, restore_best_weights=False, verbose=1)

metrics = ['accuracy']
callbacks = [reduce_lr, timestp]
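The `ReduceLROnPlateau(patience=1, factor=0.5, min_lr=1e-6)` callback above halves the learning rate each time the monitored metric stalls, clipping at `min_lr`; this is exactly the staircase visible in the `learning_rate` column of the logs below. A sketch of that arithmetic (plateau detection and cooldown are omitted; the loop simply assumes 12 consecutive plateaus):

```python
# Sketch of the ReduceLROnPlateau schedule: on each plateau the
# learning rate is multiplied by `factor`, but never drops below `min_lr`.
lr, factor, min_lr = 1e-3, 0.5, 1e-6
schedule = [lr]
for _ in range(12):  # 12 hypothetical consecutive plateaus
    lr = max(lr * factor, min_lr)
    schedule.append(lr)
# schedule: 1e-3, 5e-4, 2.5e-4, ... clipped at 1e-6
```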
In [ ]:
history_s = run_cnn('small', sizes=sizes, dropout=dropout, callbacks=callbacks)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/30
[XLA/cuDNN autotuning warnings elided: repeated "Results mismatch between different convolution algorithms. This is likely a bug/unexpected loss of precision in cudnn." messages on Tesla P100-PCIE-16GB, cuDNN 8.9.0.]
 24/459 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.3551 - loss: 1.0982
I0000 00:00:1714000596.889938      84 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
450/459 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.4419 - loss: 1.0430
459/459 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.4429 - loss: 1.0422
459/459 ━━━━━━━━━━━━━━━━━━━━ 17s 21ms/step - accuracy: 0.4430 - loss: 1.0421 - val_accuracy: 0.5292 - val_loss: 0.9670 - learning_rate: 0.0010 - duration: 16.7463
Epoch 2/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.5589 - loss: 0.9248 - val_accuracy: 0.5865 - val_loss: 0.8674 - learning_rate: 0.0010 - duration: 1.9970
Epoch 3/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.5962 - loss: 0.8624 - val_accuracy: 0.6097 - val_loss: 0.8344 - learning_rate: 0.0010 - duration: 1.9987
Epoch 4/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6146 - loss: 0.8297 - val_accuracy: 0.6015 - val_loss: 0.8359 - learning_rate: 0.0010 - duration: 2.0080
Epoch 5/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.6405 - loss: 0.7855 - val_accuracy: 0.6350 - val_loss: 0.7835 - learning_rate: 5.0000e-04 - duration: 2.0912
Epoch 6/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6502 - loss: 0.7617 - val_accuracy: 0.6436 - val_loss: 0.7672 - learning_rate: 5.0000e-04 - duration: 2.4400
Epoch 7/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6657 - loss: 0.7388 - val_accuracy: 0.6491 - val_loss: 0.7624 - learning_rate: 5.0000e-04 - duration: 1.9893
Epoch 8/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6842 - loss: 0.7146 - val_accuracy: 0.6519 - val_loss: 0.7623 - learning_rate: 5.0000e-04 - duration: 1.9963
Epoch 9/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6964 - loss: 0.6814 - val_accuracy: 0.6615 - val_loss: 0.7546 - learning_rate: 2.5000e-04 - duration: 2.0438
Epoch 10/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7075 - loss: 0.6626 - val_accuracy: 0.6643 - val_loss: 0.7494 - learning_rate: 2.5000e-04 - duration: 2.0149
Epoch 11/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7164 - loss: 0.6485 - val_accuracy: 0.6546 - val_loss: 0.7622 - learning_rate: 2.5000e-04 - duration: 1.9803
Epoch 12/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7224 - loss: 0.6315 - val_accuracy: 0.6696 - val_loss: 0.7430 - learning_rate: 1.2500e-04 - duration: 1.9868
Epoch 13/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7270 - loss: 0.6208 - val_accuracy: 0.6720 - val_loss: 0.7410 - learning_rate: 1.2500e-04 - duration: 2.0440
Epoch 14/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7343 - loss: 0.6105 - val_accuracy: 0.6732 - val_loss: 0.7465 - learning_rate: 1.2500e-04 - duration: 2.0091
Epoch 15/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7389 - loss: 0.6021 - val_accuracy: 0.6737 - val_loss: 0.7450 - learning_rate: 6.2500e-05 - duration: 1.9618
Epoch 16/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7437 - loss: 0.5963 - val_accuracy: 0.6763 - val_loss: 0.7429 - learning_rate: 3.1250e-05 - duration: 1.9589
Epoch 17/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7504 - loss: 0.5892 - val_accuracy: 0.6793 - val_loss: 0.7467 - learning_rate: 1.5625e-05 - duration: 1.9530
Epoch 18/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7519 - loss: 0.5859 - val_accuracy: 0.6801 - val_loss: 0.7492 - learning_rate: 7.8125e-06 - duration: 1.9533
Epoch 19/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7488 - loss: 0.5850 - val_accuracy: 0.6801 - val_loss: 0.7499 - learning_rate: 3.9063e-06 - duration: 1.9859
Epoch 20/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7478 - loss: 0.5833 - val_accuracy: 0.6791 - val_loss: 0.7495 - learning_rate: 1.9531e-06 - duration: 2.0122
Epoch 21/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7480 - loss: 0.5848 - val_accuracy: 0.6793 - val_loss: 0.7495 - learning_rate: 1.0000e-06 - duration: 2.0666
Epoch 22/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7477 - loss: 0.5855 - val_accuracy: 0.6790 - val_loss: 0.7494 - learning_rate: 1.0000e-06 - duration: 1.9945
Epoch 23/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7490 - loss: 0.5842 - val_accuracy: 0.6791 - val_loss: 0.7494 - learning_rate: 1.0000e-06 - duration: 1.9913
Epoch 24/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7525 - loss: 0.5838 - val_accuracy: 0.6793 - val_loss: 0.7496 - learning_rate: 1.0000e-06 - duration: 2.0077
Epoch 25/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7523 - loss: 0.5808 - val_accuracy: 0.6791 - val_loss: 0.7499 - learning_rate: 1.0000e-06 - duration: 1.9774
Epoch 26/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7482 - loss: 0.5809 - val_accuracy: 0.6791 - val_loss: 0.7498 - learning_rate: 1.0000e-06 - duration: 1.9981
Epoch 27/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7549 - loss: 0.5795 - val_accuracy: 0.6788 - val_loss: 0.7500 - learning_rate: 1.0000e-06 - duration: 1.9941
Epoch 28/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7538 - loss: 0.5808 - val_accuracy: 0.6790 - val_loss: 0.7500 - learning_rate: 1.0000e-06 - duration: 1.9836
Epoch 29/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.7512 - loss: 0.5830 - val_accuracy: 0.6790 - val_loss: 0.7501 - learning_rate: 1.0000e-06 - duration: 2.0738
Epoch 30/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7543 - loss: 0.5836 - val_accuracy: 0.6783 - val_loss: 0.7502 - learning_rate: 1.0000e-06 - duration: 1.9872
In [ ]:
plot_convergence(history=history_s)
In [ ]:
history_sa = run_cnn('small', sizes=sizes, dropout=dropout, callbacks=callbacks, augmentation_layers=image_aug_layers)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/30
2024-04-24 23:17:53.456865: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inStatefulPartitionedCall/sequential_1_1/Dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
459/459 ━━━━━━━━━━━━━━━━━━━━ 11s 14ms/step - accuracy: 0.4121 - loss: 1.0602 - val_accuracy: 0.5096 - val_loss: 0.9689 - learning_rate: 0.0010 - duration: 11.1980
Epoch 2/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.5418 - loss: 0.9471 - val_accuracy: 0.5693 - val_loss: 0.9000 - learning_rate: 0.0010 - duration: 3.5302
Epoch 3/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.5675 - loss: 0.9083 - val_accuracy: 0.5816 - val_loss: 0.8770 - learning_rate: 0.0010 - duration: 3.5331
Epoch 4/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.5790 - loss: 0.8917 - val_accuracy: 0.5948 - val_loss: 0.8532 - learning_rate: 0.0010 - duration: 3.5916
Epoch 5/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.5859 - loss: 0.8672 - val_accuracy: 0.5927 - val_loss: 0.8461 - learning_rate: 0.0010 - duration: 3.5321
Epoch 6/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.5985 - loss: 0.8526 - val_accuracy: 0.6207 - val_loss: 0.8179 - learning_rate: 0.0010 - duration: 3.5796
Epoch 7/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6060 - loss: 0.8331 - val_accuracy: 0.6156 - val_loss: 0.8414 - learning_rate: 0.0010 - duration: 3.5560
Epoch 8/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6298 - loss: 0.7967 - val_accuracy: 0.6503 - val_loss: 0.7656 - learning_rate: 5.0000e-04 - duration: 3.5678
Epoch 9/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6403 - loss: 0.7802 - val_accuracy: 0.6451 - val_loss: 0.7793 - learning_rate: 5.0000e-04 - duration: 3.6597
Epoch 10/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6480 - loss: 0.7615 - val_accuracy: 0.6624 - val_loss: 0.7373 - learning_rate: 2.5000e-04 - duration: 3.5612
Epoch 11/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6555 - loss: 0.7481 - val_accuracy: 0.6616 - val_loss: 0.7372 - learning_rate: 2.5000e-04 - duration: 3.5003
Epoch 12/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.6692 - loss: 0.7311 - val_accuracy: 0.6613 - val_loss: 0.7336 - learning_rate: 1.2500e-04 - duration: 3.4744
Epoch 13/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6664 - loss: 0.7291 - val_accuracy: 0.6697 - val_loss: 0.7225 - learning_rate: 1.2500e-04 - duration: 3.5611
Epoch 14/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6673 - loss: 0.7255 - val_accuracy: 0.6747 - val_loss: 0.7179 - learning_rate: 1.2500e-04 - duration: 3.6061
Epoch 15/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6691 - loss: 0.7252 - val_accuracy: 0.6748 - val_loss: 0.7159 - learning_rate: 1.2500e-04 - duration: 3.5549
Epoch 16/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6746 - loss: 0.7123 - val_accuracy: 0.6728 - val_loss: 0.7143 - learning_rate: 1.2500e-04 - duration: 3.5368
Epoch 17/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6686 - loss: 0.7162 - val_accuracy: 0.6745 - val_loss: 0.7135 - learning_rate: 1.2500e-04 - duration: 3.5227
Epoch 18/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6717 - loss: 0.7103 - val_accuracy: 0.6777 - val_loss: 0.7049 - learning_rate: 1.2500e-04 - duration: 3.7102
Epoch 19/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6801 - loss: 0.7038 - val_accuracy: 0.6775 - val_loss: 0.7042 - learning_rate: 1.2500e-04 - duration: 3.5439
Epoch 20/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6864 - loss: 0.7013 - val_accuracy: 0.6809 - val_loss: 0.7046 - learning_rate: 1.2500e-04 - duration: 3.5292
Epoch 21/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6844 - loss: 0.6936 - val_accuracy: 0.6810 - val_loss: 0.6968 - learning_rate: 6.2500e-05 - duration: 3.5635
Epoch 22/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6887 - loss: 0.6940 - val_accuracy: 0.6836 - val_loss: 0.6990 - learning_rate: 6.2500e-05 - duration: 3.6448
Epoch 23/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6942 - loss: 0.6894 - val_accuracy: 0.6860 - val_loss: 0.6941 - learning_rate: 3.1250e-05 - duration: 3.6526
Epoch 24/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6920 - loss: 0.6892 - val_accuracy: 0.6836 - val_loss: 0.6951 - learning_rate: 3.1250e-05 - duration: 3.6231
Epoch 25/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6929 - loss: 0.6808 - val_accuracy: 0.6865 - val_loss: 0.6917 - learning_rate: 1.5625e-05 - duration: 3.5664
Epoch 26/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6869 - loss: 0.6853 - val_accuracy: 0.6869 - val_loss: 0.6927 - learning_rate: 1.5625e-05 - duration: 3.7454
Epoch 27/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6899 - loss: 0.6857 - val_accuracy: 0.6880 - val_loss: 0.6910 - learning_rate: 7.8125e-06 - duration: 3.5356
Epoch 28/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6902 - loss: 0.6866 - val_accuracy: 0.6884 - val_loss: 0.6905 - learning_rate: 7.8125e-06 - duration: 3.5494
Epoch 29/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6888 - loss: 0.6886 - val_accuracy: 0.6903 - val_loss: 0.6909 - learning_rate: 7.8125e-06 - duration: 3.5884
Epoch 30/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6923 - loss: 0.6855 - val_accuracy: 0.6887 - val_loss: 0.6906 - learning_rate: 3.9063e-06 - duration: 3.5858
In [ ]:
plot_convergence(history=history_sa)
In [ ]:
history_m = run_cnn('medium', sizes=sizes, dropout=dropout, callbacks=callbacks)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/30
2024-04-24 23:20:06.940604: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154587: 2.91214, expected 2.33608
2024-04-24 23:20:06.940669: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154589: 4.58643, expected 4.01038
2024-04-24 23:20:06.940679: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154590: 3.79635, expected 3.2203
2024-04-24 23:20:06.940686: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154591: 4.70247, expected 4.12642
2024-04-24 23:20:06.940694: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154592: 4.03855, expected 3.46249
2024-04-24 23:20:06.940702: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154594: 4.19126, expected 3.61521
2024-04-24 23:20:06.940709: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154595: 4.6629, expected 4.08684
2024-04-24 23:20:06.940717: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154596: 4.6503, expected 4.07425
2024-04-24 23:20:06.940724: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154597: 4.66434, expected 4.08829
2024-04-24 23:20:06.940732: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 154599: 4.14528, expected 3.56922
2024-04-24 23:20:06.952656: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:705] Results mismatch between different convolution algorithms. This is likely a bug/unexpected loss of precision in cudnn.
(f32[32,16,227,227]{3,2,1,0}, u8[0]{0}) custom-call(f32[32,3,227,227]{3,2,1,0}, f32[16,3,3,3]{3,2,1,0}, f32[16]{0}), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", backend_config={"conv_result_scale":1,"activation_mode":"kRelu","side_input_scale":0,"leakyrelu_alpha":0} for eng20{k2=2,k4=1,k5=1,k6=0,k7=0} vs eng15{k5=1,k6=0,k7=1,k10=1}
2024-04-24 23:20:06.952709: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:270] Device: Tesla P100-PCIE-16GB
2024-04-24 23:20:06.952718: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:271] Platform: Compute Capability 6.0
2024-04-24 23:20:06.952725: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:272] Driver: 12020 (535.129.3)
2024-04-24 23:20:06.952733: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:273] Runtime: <undefined>
2024-04-24 23:20:06.952750: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:280] cudnn version: 8.9.0
459/459 ━━━━━━━━━━━━━━━━━━━━ 44s 79ms/step - accuracy: 0.4584 - loss: 1.0117 - val_accuracy: 0.5865 - val_loss: 0.8798 - learning_rate: 0.0010 - duration: 43.9605
Epoch 2/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.5727 - loss: 0.8899 - val_accuracy: 0.6010 - val_loss: 0.8803 - learning_rate: 0.0010 - duration: 9.1942
Epoch 3/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 21ms/step - accuracy: 0.5973 - loss: 0.8505 - val_accuracy: 0.6110 - val_loss: 0.8353 - learning_rate: 5.0000e-04 - duration: 9.6535
Epoch 4/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 21ms/step - accuracy: 0.6095 - loss: 0.8381 - val_accuracy: 0.6218 - val_loss: 0.8344 - learning_rate: 5.0000e-04 - duration: 9.8203
Epoch 5/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 21ms/step - accuracy: 0.6194 - loss: 0.8183 - val_accuracy: 0.6257 - val_loss: 0.8327 - learning_rate: 5.0000e-04 - duration: 9.7431
Epoch 6/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6261 - loss: 0.8063 - val_accuracy: 0.6263 - val_loss: 0.8148 - learning_rate: 5.0000e-04 - duration: 9.2613
Epoch 7/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6435 - loss: 0.7882 - val_accuracy: 0.6263 - val_loss: 0.8253 - learning_rate: 5.0000e-04 - duration: 9.1843
Epoch 8/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 20ms/step - accuracy: 0.6504 - loss: 0.7638 - val_accuracy: 0.6379 - val_loss: 0.7968 - learning_rate: 2.5000e-04 - duration: 10.2215
Epoch 9/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6524 - loss: 0.7506 - val_accuracy: 0.6417 - val_loss: 0.7940 - learning_rate: 2.5000e-04 - duration: 9.1427
Epoch 10/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6656 - loss: 0.7384 - val_accuracy: 0.6443 - val_loss: 0.8009 - learning_rate: 2.5000e-04 - duration: 9.1684
Epoch 11/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6700 - loss: 0.7206 - val_accuracy: 0.6433 - val_loss: 0.7907 - learning_rate: 1.2500e-04 - duration: 9.1523
Epoch 12/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6794 - loss: 0.7115 - val_accuracy: 0.6448 - val_loss: 0.7865 - learning_rate: 1.2500e-04 - duration: 9.1596
Epoch 13/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 20ms/step - accuracy: 0.6793 - loss: 0.7074 - val_accuracy: 0.6446 - val_loss: 0.7849 - learning_rate: 1.2500e-04 - duration: 10.2146
Epoch 14/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6781 - loss: 0.6963 - val_accuracy: 0.6481 - val_loss: 0.7862 - learning_rate: 1.2500e-04 - duration: 9.1379
Epoch 15/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6888 - loss: 0.6861 - val_accuracy: 0.6473 - val_loss: 0.7838 - learning_rate: 6.2500e-05 - duration: 9.1637
Epoch 16/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6923 - loss: 0.6812 - val_accuracy: 0.6475 - val_loss: 0.7846 - learning_rate: 6.2500e-05 - duration: 9.1452
Epoch 17/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 10s 20ms/step - accuracy: 0.6928 - loss: 0.6789 - val_accuracy: 0.6498 - val_loss: 0.7802 - learning_rate: 3.1250e-05 - duration: 10.2268
Epoch 18/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7041 - loss: 0.6664 - val_accuracy: 0.6483 - val_loss: 0.7805 - learning_rate: 3.1250e-05 - duration: 9.2024
Epoch 19/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7041 - loss: 0.6660 - val_accuracy: 0.6494 - val_loss: 0.7790 - learning_rate: 1.5625e-05 - duration: 9.1964
Epoch 20/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7081 - loss: 0.6648 - val_accuracy: 0.6489 - val_loss: 0.7794 - learning_rate: 1.5625e-05 - duration: 9.2082
Epoch 21/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6957 - loss: 0.6677 - val_accuracy: 0.6500 - val_loss: 0.7786 - learning_rate: 7.8125e-06 - duration: 9.2027
Epoch 22/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.6998 - loss: 0.6629 - val_accuracy: 0.6491 - val_loss: 0.7785 - learning_rate: 7.8125e-06 - duration: 9.1241
Epoch 23/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7008 - loss: 0.6586 - val_accuracy: 0.6500 - val_loss: 0.7785 - learning_rate: 3.9063e-06 - duration: 9.1428
Epoch 24/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7032 - loss: 0.6578 - val_accuracy: 0.6494 - val_loss: 0.7777 - learning_rate: 1.9531e-06 - duration: 9.1529
Epoch 25/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7039 - loss: 0.6588 - val_accuracy: 0.6487 - val_loss: 0.7779 - learning_rate: 1.9531e-06 - duration: 9.3384
Epoch 26/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7034 - loss: 0.6618 - val_accuracy: 0.6487 - val_loss: 0.7775 - learning_rate: 1.0000e-06 - duration: 9.3339
Epoch 27/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7016 - loss: 0.6609 - val_accuracy: 0.6491 - val_loss: 0.7774 - learning_rate: 1.0000e-06 - duration: 9.2386
Epoch 28/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7030 - loss: 0.6601 - val_accuracy: 0.6491 - val_loss: 0.7774 - learning_rate: 1.0000e-06 - duration: 9.2533
Epoch 29/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7101 - loss: 0.6575 - val_accuracy: 0.6487 - val_loss: 0.7775 - learning_rate: 1.0000e-06 - duration: 9.1664
Epoch 30/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 9s 20ms/step - accuracy: 0.7010 - loss: 0.6589 - val_accuracy: 0.6487 - val_loss: 0.7774 - learning_rate: 1.0000e-06 - duration: 9.1563
In [ ]:
plot_convergence(history=history_m)
In [ ]:
history_ma = run_cnn('medium', sizes=sizes, dropout=dropout, callbacks=callbacks, augmentation_layers=image_aug_layers)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/30
2024-04-24 23:25:29.863784: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inStatefulPartitionedCall/sequential_3_1/Dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
459/459 ━━━━━━━━━━━━━━━━━━━━ 21s 38ms/step - accuracy: 0.4225 - loss: 1.0442 - val_accuracy: 0.5499 - val_loss: 0.9368 - learning_rate: 0.0010 - duration: 20.9867
Epoch 2/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.5540 - loss: 0.9319 - val_accuracy: 0.5577 - val_loss: 0.9140 - learning_rate: 0.0010 - duration: 14.2243
Epoch 3/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.5687 - loss: 0.9082 - val_accuracy: 0.5790 - val_loss: 0.8711 - learning_rate: 0.0010 - duration: 14.1725
Epoch 4/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.5895 - loss: 0.8796 - val_accuracy: 0.6061 - val_loss: 0.8360 - learning_rate: 0.0010 - duration: 14.2023
Epoch 5/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.5945 - loss: 0.8631 - val_accuracy: 0.6230 - val_loss: 0.8111 - learning_rate: 0.0010 - duration: 14.1629
Epoch 6/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 20s 31ms/step - accuracy: 0.6022 - loss: 0.8446 - val_accuracy: 0.6190 - val_loss: 0.8175 - learning_rate: 0.0010 - duration: 20.4890
Epoch 7/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6295 - loss: 0.7986 - val_accuracy: 0.6387 - val_loss: 0.7831 - learning_rate: 5.0000e-04 - duration: 14.1430
Epoch 8/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6395 - loss: 0.7842 - val_accuracy: 0.6527 - val_loss: 0.7533 - learning_rate: 5.0000e-04 - duration: 14.1942
Epoch 9/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6536 - loss: 0.7650 - val_accuracy: 0.6494 - val_loss: 0.7640 - learning_rate: 5.0000e-04 - duration: 14.2010
Epoch 10/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6536 - loss: 0.7477 - val_accuracy: 0.6608 - val_loss: 0.7278 - learning_rate: 2.5000e-04 - duration: 14.1983
Epoch 11/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6674 - loss: 0.7341 - val_accuracy: 0.6713 - val_loss: 0.7141 - learning_rate: 2.5000e-04 - duration: 14.1929
Epoch 12/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6703 - loss: 0.7174 - val_accuracy: 0.6758 - val_loss: 0.7108 - learning_rate: 2.5000e-04 - duration: 14.1812
Epoch 13/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6698 - loss: 0.7133 - val_accuracy: 0.6836 - val_loss: 0.6983 - learning_rate: 2.5000e-04 - duration: 14.2263
Epoch 14/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 20s 31ms/step - accuracy: 0.6799 - loss: 0.7049 - val_accuracy: 0.6904 - val_loss: 0.6913 - learning_rate: 2.5000e-04 - duration: 20.4639
Epoch 15/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 20s 31ms/step - accuracy: 0.6826 - loss: 0.7009 - val_accuracy: 0.6895 - val_loss: 0.6888 - learning_rate: 2.5000e-04 - duration: 20.3917
Epoch 16/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6893 - loss: 0.6861 - val_accuracy: 0.6989 - val_loss: 0.6750 - learning_rate: 2.5000e-04 - duration: 14.1831
Epoch 17/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6952 - loss: 0.6815 - val_accuracy: 0.7008 - val_loss: 0.6737 - learning_rate: 2.5000e-04 - duration: 14.2185
Epoch 18/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6974 - loss: 0.6804 - val_accuracy: 0.7014 - val_loss: 0.6672 - learning_rate: 2.5000e-04 - duration: 14.1582
Epoch 19/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.6982 - loss: 0.6653 - val_accuracy: 0.7064 - val_loss: 0.6630 - learning_rate: 2.5000e-04 - duration: 14.1752
Epoch 20/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7051 - loss: 0.6654 - val_accuracy: 0.7113 - val_loss: 0.6489 - learning_rate: 2.5000e-04 - duration: 14.1557
Epoch 21/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7089 - loss: 0.6570 - val_accuracy: 0.7073 - val_loss: 0.6672 - learning_rate: 2.5000e-04 - duration: 14.1679
Epoch 22/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7062 - loss: 0.6602 - val_accuracy: 0.7140 - val_loss: 0.6434 - learning_rate: 1.2500e-04 - duration: 14.1312
Epoch 23/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7125 - loss: 0.6460 - val_accuracy: 0.7223 - val_loss: 0.6315 - learning_rate: 1.2500e-04 - duration: 14.1623
Epoch 24/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 20s 31ms/step - accuracy: 0.7131 - loss: 0.6437 - val_accuracy: 0.7191 - val_loss: 0.6348 - learning_rate: 1.2500e-04 - duration: 20.4598
Epoch 25/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7194 - loss: 0.6355 - val_accuracy: 0.7218 - val_loss: 0.6280 - learning_rate: 6.2500e-05 - duration: 14.2099
Epoch 26/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7229 - loss: 0.6284 - val_accuracy: 0.7227 - val_loss: 0.6268 - learning_rate: 6.2500e-05 - duration: 14.1481
Epoch 27/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 21s 31ms/step - accuracy: 0.7178 - loss: 0.6338 - val_accuracy: 0.7223 - val_loss: 0.6250 - learning_rate: 6.2500e-05 - duration: 20.5192
Epoch 28/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7158 - loss: 0.6270 - val_accuracy: 0.7248 - val_loss: 0.6247 - learning_rate: 6.2500e-05 - duration: 14.1742
Epoch 29/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7278 - loss: 0.6264 - val_accuracy: 0.7242 - val_loss: 0.6209 - learning_rate: 6.2500e-05 - duration: 14.1787
Epoch 30/30
459/459 ━━━━━━━━━━━━━━━━━━━━ 14s 31ms/step - accuracy: 0.7175 - loss: 0.6309 - val_accuracy: 0.7223 - val_loss: 0.6213 - learning_rate: 6.2500e-05 - duration: 14.1721
In [ ]:
plot_convergence(history=history_ma)

Xception¶
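The next runs swap the plain convolutional stacks for an Xception-style architecture, whose core building block is the depthwise-separable convolution: a per-channel k×k spatial filter followed by a 1×1 pointwise channel mix. A minimal parameter count (using the 728-channel width that also appears in `sizes` below) illustrates why this factorization is much cheaper than a standard convolution:

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def sep_conv_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1x1 pointwise mix."""
    return k * k * c_in + c_in * c_out

# For a wide 3x3 block mapping 728 -> 728 channels:
std = conv_params(3, 728, 728)      # 4,769,856 parameters
sep = sep_conv_params(3, 728, 728)  # 536,536 parameters
print(std, sep, round(std / sep, 1))  # roughly 8.9x fewer parameters
```

In Keras this factorization is what `layers.SeparableConv2D` provides, which is why the Xception variants here train at a similar per-step cost to the much narrower plain CNNs.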

In [ ]:
def run_xception(shape, 
                 sizes, 
                 dropout, 
                 augmentation_layers=None, 
                 optimizer=None,
                 metrics=['accuracy'],
                 loss=None,
                 epochs=EPOCHS,  
                 callbacks=None, 
                 return_model=False
                ):
    """
    Build and run a Xception model
    """
    train, val = load_ds(shape, augmentation_layers=augmentation_layers)
    
    model = xception_model(shape,
                           sizes, 
                           dropout=dropout, 
                           rescaling=True, 
             )
    
    if optimizer is None: 
        optimizer = Adam(0.001)
    if loss is None: 
        loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    
    model.compile(optimizer=optimizer,
                  loss=loss, 
                  metrics=metrics)
    
    history = model.fit(train, 
                        validation_data=val, 
                        epochs=epochs, 
                        callbacks=callbacks
                       )
    if return_model:
        return history.history, model
    return history.history
In [ ]:
image_aug_layers = [
    layers.RandomRotation(0.2), 
    layers.RandomZoom(0.3)
]
sizes = [128, 256, 512, 728, 1024]
dropout = 0.25

reduce_lr = ReduceLROnPlateau(patience=1, factor=0.5, min_lr=1e-6)
timestp = TimestampCallback()
early = EarlyStopping(patience=10, restore_best_weights=False, verbose=1)

metrics = ['accuracy']
callbacks = [reduce_lr, timestp]
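The `duration` value printed at the end of each epoch above comes from the custom `TimestampCallback`, whose implementation is not shown in this notebook. A minimal sketch of how such a callback could work (in practice it would subclass `keras.callbacks.Callback`; the hook signatures are the ones `model.fit()` invokes):

```python
import time

class TimestampCallback:  # hypothetical sketch; real version extends keras.callbacks.Callback
    """Records each epoch's wall-clock duration into the training logs."""

    def on_epoch_begin(self, epoch, logs=None):
        # Keras calls this at the start of every epoch.
        self._t0 = time.perf_counter()

    def on_epoch_end(self, epoch, logs=None):
        # Writing into `logs` makes the value show up in the progress bar
        # and in history.history, as 'duration' does in the output above.
        if logs is not None:
            logs["duration"] = time.perf_counter() - self._t0

# Standalone check of the hook behaviour:
cb = TimestampCallback()
logs = {}
cb.on_epoch_begin(0)
time.sleep(0.01)
cb.on_epoch_end(0, logs)
```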
In [ ]:
history_s_x = run_xception('small', sizes, dropout, callbacks=callbacks, epochs=15)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 46s 60ms/step - accuracy: 0.5302 - loss: 0.9612 - val_accuracy: 0.3357 - val_loss: 1.7060 - learning_rate: 0.0010 - duration: 45.7247
Epoch 2/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.6407 - loss: 0.7882 - val_accuracy: 0.4527 - val_loss: 1.6406 - learning_rate: 0.0010 - duration: 14.7994
Epoch 3/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.6787 - loss: 0.7210 - val_accuracy: 0.5453 - val_loss: 1.4694 - learning_rate: 0.0010 - duration: 14.7674
Epoch 4/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.7029 - loss: 0.6746 - val_accuracy: 0.6026 - val_loss: 1.0970 - learning_rate: 0.0010 - duration: 14.7735
Epoch 5/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.7278 - loss: 0.6243 - val_accuracy: 0.4869 - val_loss: 1.6485 - learning_rate: 0.0010 - duration: 14.6799
Epoch 6/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.7605 - loss: 0.5588 - val_accuracy: 0.7073 - val_loss: 0.7285 - learning_rate: 5.0000e-04 - duration: 14.7122
Epoch 7/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.7985 - loss: 0.4871 - val_accuracy: 0.7114 - val_loss: 0.8581 - learning_rate: 5.0000e-04 - duration: 14.7108
Epoch 8/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.8490 - loss: 0.3807 - val_accuracy: 0.6871 - val_loss: 0.9186 - learning_rate: 2.5000e-04 - duration: 14.7464
Epoch 9/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9055 - loss: 0.2422 - val_accuracy: 0.6974 - val_loss: 1.1072 - learning_rate: 1.2500e-04 - duration: 14.6594
Epoch 10/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9610 - loss: 0.1180 - val_accuracy: 0.7143 - val_loss: 1.1015 - learning_rate: 6.2500e-05 - duration: 14.6729
Epoch 11/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9909 - loss: 0.0502 - val_accuracy: 0.7124 - val_loss: 1.1465 - learning_rate: 3.1250e-05 - duration: 14.7515
Epoch 12/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9980 - loss: 0.0286 - val_accuracy: 0.7094 - val_loss: 1.1868 - learning_rate: 1.5625e-05 - duration: 14.7119
Epoch 13/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9994 - loss: 0.0211 - val_accuracy: 0.7083 - val_loss: 1.2115 - learning_rate: 7.8125e-06 - duration: 14.7866
Epoch 14/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9994 - loss: 0.0184 - val_accuracy: 0.7079 - val_loss: 1.2258 - learning_rate: 3.9063e-06 - duration: 14.6602
Epoch 15/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.9994 - loss: 0.0163 - val_accuracy: 0.7071 - val_loss: 1.2352 - learning_rate: 1.9531e-06 - duration: 14.7323
In [ ]:
plot_convergence(history=history_s_x)
In [ ]:
history_sa_x = run_xception('small', sizes, dropout, callbacks=callbacks, epochs=15, augmentation_layers=image_aug_layers)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 38s 54ms/step - accuracy: 0.4996 - loss: 1.0060 - val_accuracy: 0.5803 - val_loss: 0.9845 - learning_rate: 0.0010 - duration: 37.9173
Epoch 2/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.6110 - loss: 0.8246 - val_accuracy: 0.6094 - val_loss: 0.9299 - learning_rate: 0.0010 - duration: 15.0981
Epoch 3/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.6574 - loss: 0.7565 - val_accuracy: 0.6075 - val_loss: 1.0803 - learning_rate: 0.0010 - duration: 15.0955
Epoch 4/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.6926 - loss: 0.6923 - val_accuracy: 0.6736 - val_loss: 0.7361 - learning_rate: 5.0000e-04 - duration: 15.0975
Epoch 5/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7104 - loss: 0.6584 - val_accuracy: 0.6906 - val_loss: 0.6770 - learning_rate: 5.0000e-04 - duration: 15.0801
Epoch 6/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7165 - loss: 0.6451 - val_accuracy: 0.7223 - val_loss: 0.6491 - learning_rate: 5.0000e-04 - duration: 15.0748
Epoch 7/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7272 - loss: 0.6336 - val_accuracy: 0.7239 - val_loss: 0.6414 - learning_rate: 5.0000e-04 - duration: 15.1659
Epoch 8/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7317 - loss: 0.6238 - val_accuracy: 0.6919 - val_loss: 0.7155 - learning_rate: 5.0000e-04 - duration: 15.0455
Epoch 9/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7425 - loss: 0.5980 - val_accuracy: 0.7331 - val_loss: 0.6107 - learning_rate: 2.5000e-04 - duration: 15.0835
Epoch 10/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7440 - loss: 0.5817 - val_accuracy: 0.7199 - val_loss: 0.6404 - learning_rate: 2.5000e-04 - duration: 15.0848
Epoch 11/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7491 - loss: 0.5680 - val_accuracy: 0.7382 - val_loss: 0.5969 - learning_rate: 1.2500e-04 - duration: 15.0198
Epoch 12/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7583 - loss: 0.5544 - val_accuracy: 0.7348 - val_loss: 0.5954 - learning_rate: 1.2500e-04 - duration: 15.1072
Epoch 13/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 32ms/step - accuracy: 0.7553 - loss: 0.5542 - val_accuracy: 0.7487 - val_loss: 0.5831 - learning_rate: 1.2500e-04 - duration: 14.9767
Epoch 14/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7620 - loss: 0.5421 - val_accuracy: 0.7350 - val_loss: 0.6118 - learning_rate: 1.2500e-04 - duration: 15.2108
Epoch 15/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 15s 33ms/step - accuracy: 0.7634 - loss: 0.5376 - val_accuracy: 0.7425 - val_loss: 0.5795 - learning_rate: 6.2500e-05 - duration: 15.0119
In [ ]:
plot_convergence(history=history_sa_x)
In [ ]:
history_m_x = run_xception('medium', sizes, dropout, callbacks=callbacks, epochs=15)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 178s 293ms/step - accuracy: 0.4927 - loss: 1.0053 - val_accuracy: 0.3584 - val_loss: 1.2137 - learning_rate: 0.0010 - duration: 178.4982
Epoch 2/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.5774 - loss: 0.8854 - val_accuracy: 0.5031 - val_loss: 1.2167 - learning_rate: 0.0010 - duration: 114.2969
Epoch 3/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.6452 - loss: 0.7931 - val_accuracy: 0.5808 - val_loss: 1.0220 - learning_rate: 5.0000e-04 - duration: 114.2027
Epoch 4/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.6814 - loss: 0.7296 - val_accuracy: 0.6914 - val_loss: 0.7279 - learning_rate: 5.0000e-04 - duration: 114.2328
Epoch 5/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.7021 - loss: 0.6775 - val_accuracy: 0.6621 - val_loss: 0.7734 - learning_rate: 5.0000e-04 - duration: 114.3001
Epoch 6/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.7330 - loss: 0.6199 - val_accuracy: 0.7197 - val_loss: 0.6457 - learning_rate: 2.5000e-04 - duration: 114.3581
Epoch 7/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 123s 269ms/step - accuracy: 0.7483 - loss: 0.5913 - val_accuracy: 0.7183 - val_loss: 0.6489 - learning_rate: 2.5000e-04 - duration: 123.4555
Epoch 8/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.7606 - loss: 0.5512 - val_accuracy: 0.7062 - val_loss: 0.6764 - learning_rate: 1.2500e-04 - duration: 114.3073
Epoch 9/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.7832 - loss: 0.5095 - val_accuracy: 0.7183 - val_loss: 0.6547 - learning_rate: 6.2500e-05 - duration: 114.2931
Epoch 10/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.7980 - loss: 0.4771 - val_accuracy: 0.7355 - val_loss: 0.6210 - learning_rate: 3.1250e-05 - duration: 114.3039
Epoch 11/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.8125 - loss: 0.4544 - val_accuracy: 0.7336 - val_loss: 0.6345 - learning_rate: 3.1250e-05 - duration: 114.2473
Epoch 12/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.8199 - loss: 0.4327 - val_accuracy: 0.7382 - val_loss: 0.6359 - learning_rate: 1.5625e-05 - duration: 114.2918
Epoch 13/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.8243 - loss: 0.4197 - val_accuracy: 0.7367 - val_loss: 0.6392 - learning_rate: 7.8125e-06 - duration: 114.2605
Epoch 14/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.8311 - loss: 0.4128 - val_accuracy: 0.7353 - val_loss: 0.6403 - learning_rate: 3.9063e-06 - duration: 114.2684
Epoch 15/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 114s 249ms/step - accuracy: 0.8319 - loss: 0.4090 - val_accuracy: 0.7367 - val_loss: 0.6407 - learning_rate: 1.9531e-06 - duration: 114.2778
In [ ]:
plot_convergence(history=history_m_x)
In [ ]:
history_ma_x = run_xception('medium', sizes, dropout, callbacks=callbacks, epochs=15, augmentation_layers=image_aug_layers)
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Epoch 1/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 138s 271ms/step - accuracy: 0.4716 - loss: 1.0387 - val_accuracy: 0.3408 - val_loss: 1.3152 - learning_rate: 0.0010 - duration: 138.3955
Epoch 2/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.5391 - loss: 0.9308 - val_accuracy: 0.4350 - val_loss: 1.5449 - learning_rate: 0.0010 - duration: 115.2725
Epoch 3/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.5968 - loss: 0.8600 - val_accuracy: 0.5534 - val_loss: 1.0207 - learning_rate: 5.0000e-04 - duration: 115.1741
Epoch 4/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 116s 251ms/step - accuracy: 0.6288 - loss: 0.8148 - val_accuracy: 0.6436 - val_loss: 0.7898 - learning_rate: 5.0000e-04 - duration: 115.6446
Epoch 5/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 116s 251ms/step - accuracy: 0.6475 - loss: 0.7724 - val_accuracy: 0.5841 - val_loss: 0.8325 - learning_rate: 5.0000e-04 - duration: 115.5009
Epoch 6/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.6898 - loss: 0.7028 - val_accuracy: 0.6978 - val_loss: 0.6796 - learning_rate: 2.5000e-04 - duration: 115.3124
Epoch 7/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 142s 250ms/step - accuracy: 0.7079 - loss: 0.6679 - val_accuracy: 0.6837 - val_loss: 0.7202 - learning_rate: 2.5000e-04 - duration: 141.9756
Epoch 8/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7278 - loss: 0.6320 - val_accuracy: 0.7404 - val_loss: 0.6079 - learning_rate: 1.2500e-04 - duration: 115.3455
Epoch 9/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7338 - loss: 0.6148 - val_accuracy: 0.7186 - val_loss: 0.6384 - learning_rate: 1.2500e-04 - duration: 115.4247
Epoch 10/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7393 - loss: 0.5986 - val_accuracy: 0.7512 - val_loss: 0.5737 - learning_rate: 6.2500e-05 - duration: 115.3204
Epoch 11/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 142s 250ms/step - accuracy: 0.7396 - loss: 0.5929 - val_accuracy: 0.7498 - val_loss: 0.5714 - learning_rate: 6.2500e-05 - duration: 141.8744
Epoch 12/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 116s 251ms/step - accuracy: 0.7454 - loss: 0.5769 - val_accuracy: 0.7565 - val_loss: 0.5709 - learning_rate: 6.2500e-05 - duration: 115.5681
Epoch 13/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7466 - loss: 0.5769 - val_accuracy: 0.7530 - val_loss: 0.5657 - learning_rate: 6.2500e-05 - duration: 115.1872
Epoch 14/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7479 - loss: 0.5773 - val_accuracy: 0.7587 - val_loss: 0.5549 - learning_rate: 6.2500e-05 - duration: 115.2360
Epoch 15/15
459/459 ━━━━━━━━━━━━━━━━━━━━ 115s 250ms/step - accuracy: 0.7504 - loss: 0.5661 - val_accuracy: 0.7543 - val_loss: 0.5615 - learning_rate: 6.2500e-05 - duration: 115.3780
In [ ]:
plot_convergence(history=history_ma_x)

Training evaluation¶

In [ ]:
def get_history_means(history):
    """Summarize a training run: accepts a plain dict or a keras History object."""
    if isinstance(history, dict):
        hist = history
    else:
        hist = history.history
    res = dict(
        epoch_counts = len(hist.get('accuracy')),
        max_val_acc = np.round(max(hist.get('val_accuracy')), 4),
        min_val_loss = np.round(min(hist.get('val_loss')), 4),
        min_lr = np.round(min(hist.get('learning_rate')), 4),
        tot_duration = np.sum(hist.get('duration')).astype(int),
    )
    return res
In [ ]:
histories = [history_s, 
             history_sa, 
             history_m, 
             history_ma, 
             history_s_x, 
             history_sa_x, 
             history_m_x, 
             history_ma_x]

model = ['CNN'] * 4 + ['Xception'] * 4
input_data = ['small', 'small', 'medium', 'medium'] * 2
augmentation = ['False', 'True'] * 4

histories_data = list(map(get_history_means, histories))
epoch_counts = list(map(lambda x: x.get('epoch_counts'), histories_data))
max_val_acc = list(map(lambda x: x.get('max_val_acc'), histories_data))
min_val_loss = list(map(lambda x: x.get('min_val_loss'), histories_data))
tot_duration = list(map(lambda x: x.get('tot_duration'), histories_data))

header = dict(values=['Model type', 
                      'Input data', 
                      'Augmentation',
                      'Epoch counts',
                      'Max accuracy (val)', 
                      'Min loss (val)', 
                      'Total duration (s)'
                     ])

cells = dict(values=[model,
                     input_data, 
                     augmentation, 
                     epoch_counts, 
                     max_val_acc, 
                     min_val_loss, 
                     tot_duration
                    ])

CNN

Before training on all the image shapes with and without data augmentation, I tried several architectures for the CNN. I decided to keep this one because it gave the best results.

Once the architecture was chosen, I tried small (69x69) and medium (227x227) images, with and without data augmentation.

Here are a few observations about the CNN training runs:

  • Data augmentation helps. For example, with the small dataset (69x69), training ends with better results when data augmentation is used;
  • Data augmentation increases training time. As you can see in the following graphic, training takes longer with data augmentation. I tried running data augmentation both inside and outside the model, and the best results (in time) were obtained when applying augmentation directly inside the model;
  • Image shape highly impacts training time! As the graphic below also shows, the bigger the image, the longer the training;
  • Some of the models did not fully converge; we will pick one model and run it with more epochs.
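The in-model augmentation mentioned above can be sketched as follows. This is a minimal illustration (the layer parameters mirror the `image_aug_layers` used elsewhere in this notebook): augmentation layers packed into the model run on the accelerator and are only active when `training=True`, so validation and inference see the raw images.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Augmentation placed inside the model: active during training only.
aug = keras.Sequential([
    layers.RandomRotation(0.2),
    layers.RandomZoom(0.3),
])

images = np.random.rand(4, 69, 69, 3).astype("float32")
train_batch = aug(images, training=True)    # randomly rotated / zoomed
infer_batch = aug(images, training=False)   # identity pass-through

print(train_batch.shape)                             # (4, 69, 69, 3)
print(np.allclose(np.asarray(infer_batch), images))  # True
```

Because the layers are part of the graph, there is no separate CPU-side preprocessing pass over the dataset, which is consistent with the timing observation above.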

Xception

In general this architecture performed better than the simple CNN architecture. The observations we made about the CNN remain valid: data augmentation increases running time, and larger images take longer to train. We can also see overfitting in the models trained without data augmentation.
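That overfitting can be read directly from the history: the gap between training and validation accuracy keeps widening. A tiny helper, with illustrative numbers in the spirit of the medium / no-augmentation run above (not the exact values):

```python
# Flag overfitting by tracking the gap between train and validation accuracy.
def accuracy_gap(hist):
    return [round(a - v, 4) for a, v in zip(hist['accuracy'], hist['val_accuracy'])]

# Illustrative numbers only.
hist = {
    'accuracy':     [0.49, 0.68, 0.76, 0.83],
    'val_accuracy': [0.36, 0.69, 0.71, 0.74],
}
print(accuracy_gap(hist))  # [0.13, -0.01, 0.05, 0.09] -- a growing gap signals overfitting
```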

In [ ]:
table = go.Figure(data=[go.Table(header=header, cells=cells)])
table.update_layout(paper_bgcolor='rgba(0,0,0,0)',
                    plot_bgcolor='rgba(0,0,0,0)', 
                    margin=dict(l=0, r=0, t=0, b=0),
                    height=205
                   )
table.show()

The model with the best ratio between validation accuracy and duration is the Xception model with small images and data augmentation. I will try to push this model a bit further.
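This selection criterion can be sketched as ranking the runs by validation accuracy gained per second of training. The numbers below are illustrative placeholders, not the table's actual values:

```python
# Rank training runs by validation accuracy per second of training time.
runs = [
    {'name': 'CNN small + aug',       'max_val_acc': 0.70, 'tot_duration': 400},
    {'name': 'Xception small + aug',  'max_val_acc': 0.75, 'tot_duration': 230},
    {'name': 'Xception medium + aug', 'max_val_acc': 0.76, 'tot_duration': 1760},
]
best = max(runs, key=lambda r: r['max_val_acc'] / r['tot_duration'])
print(best['name'])  # Xception small + aug
```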

Final model¶

In [ ]:
def plot_results(y_pred, y_true, title='Prediction stats', average='binary'): 
    """
    Plot the results
    """
    cm = confusion_matrix(y_pred=y_pred, y_true=y_true, )
    ac = accuracy_score(y_pred=y_pred, y_true=y_true)
    f1 = f1_score(y_pred=y_pred, y_true=y_true, average=average)
    recall = recall_score(y_pred=y_pred, y_true=y_true, average=average)
    precision = precision_score(y_pred=y_pred, y_true=y_true, average=average)
    
    fig = make_subplots(
        rows=1, 
        cols=2, 
        column_widths=[0.75, 0.25]
    )
    fig.add_trace(
        go.Heatmap(
            z=cm,
            x=CATS,
            y=CATS,
            showscale=True, 
            transpose=False,
            text=cm,
            textfont={"size": 30}, 
            texttemplate="%{text}",
            colorscale='viridis',
            xgap=8, 
            ygap=8, 
        ), 
        row=1, 
        col=2, 
        
    )
    
    fig.add_trace(
        go.Bar(
            x=['Accuracy', "F1 score", "Recall", "Precision"], 
            y=[ac, f1, recall, precision], 
            marker=dict(
                color=[ac, f1, recall, precision], 
                colorscale='Bluered', 
                cmin=0, 
                cmax=1, 
                reversescale=True
            ),
            text=[ac, f1, recall, precision], 
            texttemplate="%{text:.4f}",  
            textfont={"size": 30}
        ), 
        row=1, 
        col=1
    )
    
    fig.update_layout(
        yaxis1_range=[0, 1],
        xaxis2_title="Predicted", 
        yaxis2_title="Expected", 
        title=title + f', (average = {average})', 
        template='plotly_white'
        
    )
    
    return fig 

def make_predictions(trained_model, test_ds):
    predictions = trained_model.predict(test_ds)
    pred_soft = tf.nn.softmax(predictions)
    y_pred = np.argmax(pred_soft, axis=1)
    y_true = np.concatenate([y for x, y in test_ds], axis=0) 
    return y_pred, y_true
    
In [ ]:
test_ds = keras.utils.image_dataset_from_directory(TEST_PATH.joinpath('small'), 
                                                   labels='inferred', 
                                                   image_size=(69, 69), shuffle=False) 
test_ds.class_names
Found 8976 files belonging to 3 classes.
Out[ ]:
['E', 'S', 'SB']
In [ ]:
train, val = keras.utils.image_dataset_from_directory(TRAIN_PATH.joinpath('small'), 
                                                       labels='inferred', 
                                                       image_size=(69, 69), 
                                                       shuffle=True, 
                                                       subset='both',
                                                       validation_split=TRAIN_TEST_SPLIT, 
                                                       seed=42
                                                      ) 
train.class_names
Found 20946 files belonging to 3 classes.
Using 14663 files for training.
Using 6283 files for validation.
Out[ ]:
['E', 'S', 'SB']
In [ ]:
train = train.cache().prefetch(buffer_size=AUTOTUNE)
val = val.cache().prefetch(buffer_size=AUTOTUNE)

Test on new data¶

In [ ]:
image_aug_layers = [
    layers.RandomRotation(0.2), 
    layers.RandomZoom(0.3)
]
sizes = [128, 256, 512, 728, 1024]
dropout = 0.2

reduce_lr = ReduceLROnPlateau(patience=1, factor=0.5, min_lr=1e-6)
timestp = TimestampCallback()
early = EarlyStopping(patience=4, restore_best_weights=True, verbose=1)

metrics = ['accuracy']
callbacks = [reduce_lr, timestp, early]
In [ ]:
model = xception_model('small',
                       sizes, 
                       dropout = dropout, 
                       rescaling = True, 
                       augmentation_layers = image_aug_layers
                      )
    
model.compile(optimizer=Adam(0.001), 
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True), 
              metrics=metrics)

history= model.fit(train, 
                   validation_data=val,
                   callbacks=callbacks, 
                   epochs=20)
Epoch 1/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 38s 62ms/step - accuracy: 0.4935 - loss: 1.0088 - val_accuracy: 0.5598 - val_loss: 0.9911 - learning_rate: 0.0010 - duration: 37.5138
Epoch 2/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.5974 - loss: 0.8377 - val_accuracy: 0.6185 - val_loss: 0.9648 - learning_rate: 0.0010 - duration: 26.5855
Epoch 3/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.6501 - loss: 0.7712 - val_accuracy: 0.5988 - val_loss: 1.0398 - learning_rate: 0.0010 - duration: 26.5009
Epoch 4/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.6915 - loss: 0.6992 - val_accuracy: 0.6721 - val_loss: 0.7377 - learning_rate: 5.0000e-04 - duration: 26.7738
Epoch 5/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7068 - loss: 0.6682 - val_accuracy: 0.7121 - val_loss: 0.6598 - learning_rate: 5.0000e-04 - duration: 26.9327
Epoch 6/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 41s 58ms/step - accuracy: 0.7198 - loss: 0.6440 - val_accuracy: 0.6772 - val_loss: 0.7370 - learning_rate: 5.0000e-04 - duration: 40.7627
Epoch 7/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7265 - loss: 0.6194 - val_accuracy: 0.7301 - val_loss: 0.6219 - learning_rate: 2.5000e-04 - duration: 26.9575
Epoch 8/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7331 - loss: 0.6082 - val_accuracy: 0.7282 - val_loss: 0.6266 - learning_rate: 2.5000e-04 - duration: 27.0895
Epoch 9/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7458 - loss: 0.5930 - val_accuracy: 0.7184 - val_loss: 0.6288 - learning_rate: 1.2500e-04 - duration: 26.7766
Epoch 10/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7537 - loss: 0.5680 - val_accuracy: 0.7240 - val_loss: 0.6134 - learning_rate: 6.2500e-05 - duration: 26.8254
Epoch 11/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7522 - loss: 0.5618 - val_accuracy: 0.7239 - val_loss: 0.6197 - learning_rate: 6.2500e-05 - duration: 26.7779
Epoch 12/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7569 - loss: 0.5561 - val_accuracy: 0.7399 - val_loss: 0.5926 - learning_rate: 3.1250e-05 - duration: 26.7953
Epoch 13/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7537 - loss: 0.5525 - val_accuracy: 0.7340 - val_loss: 0.5990 - learning_rate: 3.1250e-05 - duration: 26.7822
Epoch 14/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7555 - loss: 0.5495 - val_accuracy: 0.7434 - val_loss: 0.5856 - learning_rate: 1.5625e-05 - duration: 26.8164
Epoch 15/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7583 - loss: 0.5496 - val_accuracy: 0.7425 - val_loss: 0.5826 - learning_rate: 1.5625e-05 - duration: 26.8653
Epoch 16/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7604 - loss: 0.5408 - val_accuracy: 0.7436 - val_loss: 0.5848 - learning_rate: 1.5625e-05 - duration: 26.9197
Epoch 17/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7626 - loss: 0.5425 - val_accuracy: 0.7504 - val_loss: 0.5748 - learning_rate: 7.8125e-06 - duration: 26.9003
Epoch 18/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7597 - loss: 0.5440 - val_accuracy: 0.7473 - val_loss: 0.5774 - learning_rate: 7.8125e-06 - duration: 26.7344
Epoch 19/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7604 - loss: 0.5419 - val_accuracy: 0.7504 - val_loss: 0.5742 - learning_rate: 3.9063e-06 - duration: 26.8792
Epoch 20/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7646 - loss: 0.5406 - val_accuracy: 0.7519 - val_loss: 0.5724 - learning_rate: 3.9063e-06 - duration: 26.8689
Restoring model weights from the end of the best epoch: 20.
In [ ]:
plot_convergence(history=history, title=f"Final model, dropout = {dropout}")
In [ ]:
y_pred, y_true = make_predictions(model, test_ds)
plot_results(y_pred, y_true, title=f'Prediction stats, dropout = {dropout}', average='macro')
281/281 ━━━━━━━━━━━━━━━━━━━━ 4s 13ms/step
In [ ]:
dropout = 0.3
reduce_lr = ReduceLROnPlateau(patience=1, factor=0.5, min_lr=1e-6)
timestp = TimestampCallback()
early = EarlyStopping(patience=4, restore_best_weights=True, verbose=1)

metrics = ['accuracy']
callbacks = [reduce_lr, timestp, early]
In [ ]:
model_1 = xception_model('small',
                       sizes, 
                       dropout = dropout, 
                       rescaling = True, 
                       augmentation_layers = image_aug_layers
                      )
    
model_1.compile(optimizer=Adam(0.001), 
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True), 
              metrics=metrics)

history_1= model_1.fit(train, 
                  validation_data=val,
                  callbacks=callbacks, 
                  epochs=20)
Epoch 1/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 34s 59ms/step - accuracy: 0.5104 - loss: 0.9970 - val_accuracy: 0.5329 - val_loss: 1.0005 - learning_rate: 0.0010 - duration: 33.9762
Epoch 2/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.6117 - loss: 0.8385 - val_accuracy: 0.5139 - val_loss: 1.4021 - learning_rate: 0.0010 - duration: 26.6046
Epoch 3/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.6596 - loss: 0.7552 - val_accuracy: 0.6836 - val_loss: 0.7100 - learning_rate: 5.0000e-04 - duration: 26.7069
Epoch 4/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.6908 - loss: 0.6990 - val_accuracy: 0.6755 - val_loss: 0.7698 - learning_rate: 5.0000e-04 - duration: 26.6636
Epoch 5/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7095 - loss: 0.6586 - val_accuracy: 0.6828 - val_loss: 0.7159 - learning_rate: 2.5000e-04 - duration: 26.7007
Epoch 6/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7277 - loss: 0.6228 - val_accuracy: 0.6892 - val_loss: 0.6967 - learning_rate: 1.2500e-04 - duration: 26.7075
Epoch 7/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7342 - loss: 0.6038 - val_accuracy: 0.7223 - val_loss: 0.6288 - learning_rate: 1.2500e-04 - duration: 26.6872
Epoch 8/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7388 - loss: 0.5998 - val_accuracy: 0.6951 - val_loss: 0.6886 - learning_rate: 1.2500e-04 - duration: 26.6740
Epoch 9/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7400 - loss: 0.5904 - val_accuracy: 0.6941 - val_loss: 0.6755 - learning_rate: 6.2500e-05 - duration: 26.7107
Epoch 10/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7458 - loss: 0.5784 - val_accuracy: 0.7227 - val_loss: 0.6223 - learning_rate: 3.1250e-05 - duration: 26.6249
Epoch 11/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7502 - loss: 0.5716 - val_accuracy: 0.7318 - val_loss: 0.6071 - learning_rate: 3.1250e-05 - duration: 26.7093
Epoch 12/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7554 - loss: 0.5676 - val_accuracy: 0.7307 - val_loss: 0.6074 - learning_rate: 3.1250e-05 - duration: 26.6794
Epoch 13/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7520 - loss: 0.5586 - val_accuracy: 0.7355 - val_loss: 0.5986 - learning_rate: 1.5625e-05 - duration: 26.7133
Epoch 14/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 41s 58ms/step - accuracy: 0.7558 - loss: 0.5527 - val_accuracy: 0.7350 - val_loss: 0.5970 - learning_rate: 1.5625e-05 - duration: 40.9187
Epoch 15/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7552 - loss: 0.5605 - val_accuracy: 0.7344 - val_loss: 0.5948 - learning_rate: 1.5625e-05 - duration: 26.7387
Epoch 16/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7545 - loss: 0.5620 - val_accuracy: 0.7329 - val_loss: 0.5957 - learning_rate: 1.5625e-05 - duration: 26.7127
Epoch 17/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7618 - loss: 0.5518 - val_accuracy: 0.7422 - val_loss: 0.5825 - learning_rate: 7.8125e-06 - duration: 26.8689
Epoch 18/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7557 - loss: 0.5547 - val_accuracy: 0.7403 - val_loss: 0.5864 - learning_rate: 7.8125e-06 - duration: 26.8242
Epoch 19/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 59ms/step - accuracy: 0.7573 - loss: 0.5516 - val_accuracy: 0.7441 - val_loss: 0.5780 - learning_rate: 3.9063e-06 - duration: 26.8649
Epoch 20/20
459/459 ━━━━━━━━━━━━━━━━━━━━ 27s 58ms/step - accuracy: 0.7542 - loss: 0.5538 - val_accuracy: 0.7445 - val_loss: 0.5779 - learning_rate: 3.9063e-06 - duration: 26.7374
Restoring model weights from the end of the best epoch: 20.
In [ ]:
plot_convergence(history=history_1, title=f"Final model, dropout = {dropout}")
In [ ]:
y_pred, y_true = make_predictions(model_1, test_ds)
plot_results(y_pred, y_true, title=f'Prediction stats, dropout = {dropout}', average='macro')
281/281 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step

The two models performed well on the test data. The best model was the one with a dropout of $0.3$. If we take a look at the confusion matrix, we can make some observations:

  • The classification of type E galaxies is the best. This makes sense, because visually the ellipticals are already very different from the spiral galaxies.
  • The spiral galaxies S are misclassified as E about as often as they are misclassified as SB.
  • The previous observation also holds for barred spiral galaxies SB; however, the false positives are less frequent.
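These per-class observations can be quantified directly from the confusion matrix. A small sketch with a made-up matrix (rows = expected, columns = predicted, in the order E, S, SB; the numbers are not the notebook's actual results):

```python
import numpy as np

# Toy confusion matrix (rows = expected, cols = predicted) for E, S, SB.
cm = np.array([
    [900,  60,  40],   # E
    [100, 700, 200],   # S
    [ 80, 250, 670],   # SB
])

recall    = cm.diagonal() / cm.sum(axis=1)   # per-class recall
precision = cm.diagonal() / cm.sum(axis=0)   # per-class precision

for name, r, p in zip(['E', 'S', 'SB'], recall, precision):
    print(f"{name}: recall={r:.3f}, precision={p:.3f}")
```

The macro averages reported by `plot_results` (with `average='macro'`) are simply the unweighted means of these per-class values.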

Results post processing¶

In [ ]:
correct = y_pred == y_true
files = list(map(lambda x: x.split("/")[-1].split(".")[0], test_ds.file_paths))

df_pred = pl.DataFrame({'correct': correct, 'asset_id': files})
In [ ]:
pred_df = (
    df 
    .with_columns(pl.col('asset_id').cast(pl.Utf8))
    .join(df_pred, on='asset_id')
    .with_columns(pl.col('correct').cast(pl.UInt8))
)
pred_df.head()
Out[ ]:
shape: (5, 12)
|     | dr7objid           | asset_id | gz2class | total_classifications | total_votes | agreement | target | path_small        | path_medium       | path_large        | correct |
| i64 | i64                | str      | str      | i64                   | i64         | f64       | str    | str               | str               | str               | u8      |
| 0   | 587732591714893851 | "58957"  | "Sc+t"   | 45                    | 342         | 1.0       | "S"    | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" | 1       |
| 1   | 588009368545984617 | "193641" | "Sb+t"   | 42                    | 332         | 1.0       | "S"    | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" | 1       |
| 3   | 587741723357282317 | "158501" | "Sc+t"   | 28                    | 218         | 0.766954  | "S"    | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" | 1       |
| 10  | 587745403080146952 | "187749" | "Er"     | 51                    | 191         | 0.340943  | "E"    | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" | 1       |
| 24  | 588295841247592461 | "238289" | "Er"     | 33                    | 110         | 0.349646  | "E"    | "/kaggle/input/…" | "/kaggle/input/…" | "/kaggle/input/…" | 1       |
In [ ]:
plot_df = pred_df.group_by(['target', 'correct']).agg(pl.mean('total_votes'))

fig = go.Figure() 

for i, c in enumerate(["Bad classification", "Correct classification"]):
    d = plot_df.filter(pl.col('correct') == i).sort('target')
    x = d['target'].to_numpy()
    y = d['total_votes'].to_numpy()
    fig.add_traces(go.Bar(x=x, 
                          y=y, 
                          name=c
                         ))
    
fig.update_layout(barmode='group', 
                  title='Mean votes per target and classification success', 
                  xaxis=dict(title='Category'),
                  yaxis=dict(title='Mean votes'),                  
                 )

fig.show()

We can see that the target with the best predictions (E) has the same mean number of votes for good and bad predictions. However, this is not the case for the categories S and SB, where correct classifications have a higher mean vote count. This may suggest that, in our dataset, these galaxies needed more votes before an agreement on the galaxy type could be reached, and that the classifications backed by more votes are the more reliable ones.

Conclusion¶

In this project, we went through several steps: EDA, data preparation, model training and comparison, model selection, and model evaluation. Let's summarize each of these steps in a few words.

EDA

The EDA consisted of the following steps:

  • Inspect the labels and input CSV;
  • Restrict the data to entries for which we have images;
  • Compute basic statistics on the dataset;
  • Compute image statistics.

The knowledge acquired during this step was useful for the next ones.

Data preparation In this step, we moved the files (using symlinks only) into folders in order to read them in batches with TensorFlow. We applied downsampling so that every class keeps only as many images as are available in the category with the fewest images.
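The preparation step can be sketched like this; it is a simplified, hypothetical version of the notebook's actual helper, demonstrated on dummy files in a temporary directory:

```python
import os
import random
import tempfile
from pathlib import Path

def downsample_with_symlinks(files_by_class, dest_root, seed=42):
    """Link (not copy) a balanced sample: every class is cut down
    to the size of the smallest class."""
    rng = random.Random(seed)
    n_min = min(len(f) for f in files_by_class.values())
    for cat, files in files_by_class.items():
        out_dir = Path(dest_root) / cat
        out_dir.mkdir(parents=True, exist_ok=True)
        for src in rng.sample(files, n_min):
            os.symlink(src, out_dir / Path(src).name)
    return n_min

# Demo on dummy files: classes of size 5 / 3 / 4 are all reduced to 3.
with tempfile.TemporaryDirectory() as tmp:
    src = {}
    for cat, n in [('E', 5), ('S', 3), ('SB', 4)]:
        d = Path(tmp, 'raw', cat)
        d.mkdir(parents=True)
        src[cat] = [d / f'{i}.jpg' for i in range(n)]
        for p in src[cat]:
            p.touch()
    n_min = downsample_with_symlinks(src, Path(tmp, 'train'))
    counts = {c: len(list(Path(tmp, 'train', c).iterdir())) for c in src}
print(n_min, counts)  # 3 {'E': 3, 'S': 3, 'SB': 3}
```

Symlinks keep the balanced training tree cheap to build and leave the original image files untouched.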

Models The functions declared in this step allowed us to run many trainings with several parameter sets. Two different architectures were built:

  • CNN
  • Xception

We also introduced data augmentation, which increased the accuracy of every model.

Model selection

Finally, we selected one model of the Xception type and tried to tune the dropout. We ended up with a model with an accuracy of 75%, which is a decent result given the quality of the ground truth.

Overall the quality of the classification is good, but it could still be improved.

Final thoughts¶

The main difficulty of this project was the quality of the ground truth, which in this case may be wrong. It could be interesting to run an unsupervised classification to see the difference between the machine classification and the human classification. Other models could be tested as well.